Towards a multi-layered dependency annotation of Finnish

Authors

  • Alicia Burga
  • Simon Mille
  • Anton Granvik
  • Leo Wanner
Abstract

We present a dependency annotation scheme for Finnish which aims at respecting the multilayered nature of language. We first tackle the annotation of surface-syntactic structures (SSyntS) as inspired by the Meaning-Text framework. Exclusively syntactic criteria are used when defining the surface-syntactic relations tagset. Our annotation scheme allows for a direct mapping between surface-syntax and a more semantics-oriented representation, in particular predicate-argument structures. It has been applied to a corpus of Finnish, composed of 2,025 sentences related to weather conditions.

Similar resources

Towards a Dependency-based PropBank of General Finnish

In this work, we present the first results of a project aiming at a Finnish Proposition Bank, an annotated corpus of semantic roles. The annotation is based on an existing treebank of Finnish, the Turku Dependency Treebank, annotated using the well-known Stanford Dependency scheme. We describe the use of the dependency treebank for PropBanking purposes and show that both annotation layers prese...

Dependency Annotation of Wikipedia: First Steps Towards a Finnish Treebank

In this work, we present the first results obtained during the annotation of a general Finnish treebank in the Stanford Dependency scheme. We find that the scheme is a suitable syntax representation for Finnish, with only minor modifications needed. The treebank is based on text from the Finnish Wikipedia, ensuring its free distribution and broad topical variance. To assess the suitability of W...

An annotation scheme for Persian based on Autonomous Phrases Theory and Universal Dependencies

A treebank is a corpus with linguistic annotations above the level of the parts of speech. During the first half of the present decade, three treebanks have been developed for Persian either originally or subsequently based on dependency grammar: Persian Treebank (PerTreeBank), Persian Syntactic Dependency Treebank, and Uppsala Persian Dependency Treebank (UPDT). The syntactic analysis of a sen...

Adding multi-layer semantics to the Greek Dependency Treebank

In this paper we give an overview of the approach adopted to add a layer of semantic information to the Greek Dependency Treebank [GDT]. Our ultimate goal is to come up with a large corpus, reliably annotated with rich semantic structures. To this end, a corpus has been compiled encompassing various data sources and domains. This collection has been preprocessed, annotated and validated on the ...

A Multi-Representational and Multi-Layered Treebank for Hindi/Urdu

This paper describes the simultaneous development of dependency structure and phrase structure treebanks for Hindi and Urdu, as well as a PropBank. The dependency structure and the PropBank are manually annotated, and then the phrase structure treebank is produced automatically. To ensure successful conversion the development of the guidelines for all three representations are carefully coordin...


Publication date: 2015